96 research outputs found

    CNN-RNN based method for license plate recognition

    Achieving good recognition results for license plates is challenging due to multiple adverse factors. For instance, in Malaysia, private vehicles (e.g., cars) have numbers with a dark background, while public vehicles (taxis/cabs) have numbers with a white background. To reduce the complexity of the problem, we propose to classify these two types of images so that an appropriate method can be chosen for each to achieve better results. Therefore, in this work, we explore the combination of Convolutional Neural Networks (CNN) and Recurrent Neural Networks, namely BLSTM (Bi-Directional Long Short-Term Memory), for recognition. The CNN is used for feature extraction because of its high discriminative ability, while the BLSTM can extract context information based on past information. For classification, we propose Dense Cluster based Voting (DCV), which separates foreground and background for successful classification of private and public vehicles. Experimental results on live data given by MIMOS, which is funded by the Malaysian Government, and on the standard UCSD dataset show that the proposed classification outperforms the existing methods. In addition, the recognition results show that recognition performance improves significantly after classification compared to before classification.

    Exact string matching algorithms : survey, issues, and future research directions

    String matching has been an extensively studied research domain in the past two decades due to its various applications in the fields of text, image, signal, and speech processing. As a result, choosing an appropriate string matching algorithm for current applications and addressing the associated challenges is difficult. Understanding different string matching approaches (such as exact and approximate string matching algorithms), integrating several algorithms, and modifying algorithms to address related issues are also difficult. This paper presents a survey of single-pattern exact string matching algorithms. The main purpose of this survey is to propose a new classification, identify new directions, and highlight the possible challenges, current trends, and future work in the area of string matching algorithms, with a core focus on exact string matching algorithms. © 2013 IEEE
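
    To make "single-pattern exact string matching" concrete, the following is a minimal sketch of one well-known algorithm of this kind, Horspool's variant of Boyer-Moore. The abstract does not name the algorithms the survey covers, so the choice of algorithm and the function name `horspool_search` are assumptions made purely for illustration.

```python
def horspool_search(text: str, pattern: str) -> list[int]:
    """Return the start index of every exact occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Bad-character shift table: for each character in the pattern (except
    # the final position), the distance from its last occurrence to the
    # end of the pattern. Characters not in the table shift by m.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    matches = []
    pos = 0
    while pos <= n - m:
        # Compare the current window with the pattern.
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        # Slide the window according to the text character that is
        # aligned with the last pattern position.
        pos += shift.get(text[pos + m - 1], m)
    return matches
```

    For example, `horspool_search("abcabc", "abc")` returns `[0, 3]`, and overlapping occurrences are also reported.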

    Anomaly Detection in Natural Scene Images Based on Enhanced Fine-Grained Saliency and Fuzzy Logic

    This paper proposes a simple yet effective method for anomaly detection in natural scene images to improve natural scene text detection and recognition. In the last decade, there has been significant progress in text detection and recognition in natural scene images. However, in cases where logos, company symbols, or other decorative elements are used for text, existing methods do not perform well. This work considers such misclassified components, which are part of the text, as anomalies, and presents a new idea for detecting such anomalies in the text to improve text detection and recognition in natural scene images. The proposed method takes the result of an existing text detection method as input for segmenting characters or components based on a saliency map and rough set theory. For each segmented component, the proposed method extracts features from the saliency map based on density, pixel distribution, and phase congruency, and classifies text and non-text components using a fuzzy-based classifier. To verify the effectiveness of the method, we have performed experiments on several benchmark datasets of natural scene text detection, namely, MSRA-TD500 and SVT. Experimental results show the efficacy of the proposed method over the existing ones for text detection and recognition on these datasets.

    An Automatic Zone Detection System for Safe Landing of UAVs

    As the demand increases for the use of Unmanned Aerial Vehicles (UAVs) for monitoring natural disasters, protecting territories, spraying, vigilance in urban areas, etc., detecting safe landing zones has become a new area of interest. This paper presents an intelligent system for detecting regions to which a UAV can navigate when it requires an emergency landing due to technical causes. The proposed system exploits the fact that safe regions in images have flat surfaces, which are extracted using the Gabor Transform. This results in images of different orientations. The proposed system then performs histogram operations on the different Gabor-oriented images to select the pixels that contribute to the highest peak as Candidate Pixels (CPs) for the respective Gabor-oriented images. Next, to group candidate pixels into one region, we explore Markov Chain Codes (MCCs), which estimate the probability of pixels being classified as candidates together with neighboring pixels. This process results in the detection of Candidate Regions (CRs). For each image of the respective Gabor orientation, including CRs, the proposed system finds the candidate region that has the largest area and considers it as a reference. We then estimate the degree of similarity between the reference CR and the corresponding CRs in the respective Gabor-oriented images using a chi-square distance measure. Furthermore, the proposed system chooses the CR that gives the highest similarity to the reference CR and fuses it with that reference, which results in the establishment of safe landing zones for the UAV. Experimental results on images from different situations for safe landing detection show that the proposed system outperforms the existing systems. Furthermore, experimental results on relative success rates for different emergency conditions of UAVs show that the proposed intelligent system is effective and useful compared to the existing UAV safe landing systems.
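
    The similarity step in this abstract relies on a chi-square distance. The sketch below shows one common form of that measure between two histograms, and how the candidate with the smallest distance (that is, the highest similarity) could be selected. The abstract does not say which features of the CRs are compared or which variant of the distance is used, so the histogram inputs, the 0.5 scaling, and the names `chi_square_distance` and `most_similar` are assumptions for illustration only.

```python
def chi_square_distance(hist_a, hist_b):
    """Symmetric chi-square distance: 0.5 * sum((a - b)^2 / (a + b))."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        denom = a + b
        if denom > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / denom
    return 0.5 * total


def most_similar(reference, candidates):
    """Return the index of the candidate histogram closest to the reference."""
    distances = [chi_square_distance(reference, c) for c in candidates]
    return min(range(len(candidates)), key=distances.__getitem__)
```

    A distance of zero means the two histograms are identical; larger values indicate lower similarity.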

    A new image size reduction model for an efficient visual sensor network

    Image size reduction for energy-efficient transmission without losing quality is critical in Visual Sensor Networks (VSNs). The proposed method finds overlapping regions using camera locations, which eliminates unfocussed regions from the input images. The sharpness of the overlapped regions is estimated to find the Dominant Overlapping Region (DOR). The proposed model further partitions the DOR into sub-DORs according to the capacity of the cameras. To reduce noise effects in the sub-DORs, we propose to perform a Median operation, which results in a Compressed Significant Region (CSR). For non-DORs, we obtain Sobel edges, which reduces the images to a binary form. The CSR and the Sobel edges of the non-DORs are sent by the VSN. Experimental results and a comparative study with the state-of-the-art methods show that the proposed model outperforms the existing methods in terms of quality, energy consumption and network lifetime.
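
    The reduction of non-DOR regions to binary Sobel edges can be pictured with the minimal sketch below. It covers only that one step, applies the standard 3x3 Sobel kernels to a grayscale image stored as a list of rows, and thresholds the gradient magnitude. The threshold, the handling of border pixels, and the name `sobel_binary_edges` are assumptions; the abstract does not specify these details.

```python
def sobel_binary_edges(image, threshold):
    """Return a 0/1 edge map of a grayscale image given as a list of rows."""
    height, width = len(image), len(image[0])
    kernel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
    kernel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient

    # Border pixels are left at 0 because the 3x3 window does not fit there.
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += kernel_x[dy][dx] * pixel
                    gy += kernel_y[dy][dx] * pixel
            # Mark the pixel as an edge if the gradient magnitude is large.
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges
```

    Each output pixel needs only one bit instead of eight, which is the sense in which an edge map is smaller than the original grayscale region.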

    A deep action-oriented video image classification system for text detection and recognition

    For video images with complex actions, achieving accurate text detection and recognition results is very challenging. This paper presents a hybrid model for the classification of action-oriented video images, which reduces the complexity of the problem to improve text detection and recognition performance. Here, we consider the following five genres: concert, cooking, craft, teleshopping and yoga. For classifying action-oriented video images, we explore ResNet50 for learning general pixel-distribution level information, a VGG16 network for learning the features of Maximally Stable Extremal Regions, and another VGG16 for learning facial components obtained by a multitask cascaded convolutional network. The approach integrates the outputs of the three above-mentioned models using a fully connected neural network to classify the five action-oriented image classes. We demonstrate the efficacy of the proposed method by testing on our dataset and two other standard datasets, namely, the Scene Text Dataset, which contains 10 classes of scene images with text information, and the Stanford 40 Actions dataset, which contains 40 action classes without text information. Our method outperforms the related existing work and significantly enhances the class-specific performance of text detection and recognition.